IDE sample of "unsupported sources"->DataFrame #1231
2b58d6c to 738cd46
738cd46 to 0e16817
80a82ae to 46128e8
408b0a5 to 05fd49e
…e examples clearer and more reproducible
# Conflicts: # settings.gradle.kts
Great examples!
I also think we need to add an FAQ or something similar, in the README and on the website, that mentions all these external-source examples.
 * This can be useful for storing matrices for easier access later or to simply organize data read from other files.
 * For example, MRI data is often stored as 3D arrays and sometimes even 4D arrays.
 */
fun main() {
Seems like we could claim it as a NumPy array source reader via Multik
Within limits, yes. Multik cannot read Fortran-contiguous NumPy arrays, for instance. That is something I came across when looking for MRI data.
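To illustrate the Fortran-contiguous limitation mentioned above: a Fortran-ordered (column-major) buffer stores the first index fastest, while a C-ordered (row-major) buffer stores the last index fastest. The following is a minimal sketch of one possible workaround, not something this PR does. It assumes the raw values of a 3D array have already been obtained as a flat `DoubleArray` by some other means, and the function name `fortranToRowMajor` is hypothetical. It uses only the Kotlin standard library and makes no claim about Multik's API.

```kotlin
/**
 * Hypothetical helper (not part of Multik or DataFrame):
 * reorders a flat buffer that holds a 3D array of shape (nx, ny, nz)
 * in Fortran (column-major) order into C (row-major) order.
 */
fun fortranToRowMajor(data: DoubleArray, nx: Int, ny: Int, nz: Int): DoubleArray {
    require(data.size == nx * ny * nz) { "Buffer size does not match shape" }
    val result = DoubleArray(data.size)
    for (i in 0 until nx) {
        for (j in 0 until ny) {
            for (k in 0 until nz) {
                // Column-major: the first index varies fastest.
                val fortranIndex = i + nx * (j + ny * k)
                // Row-major: the last index varies fastest.
                val rowMajorIndex = (i * ny + j) * nz + k
                result[rowMajorIndex] = data[fortranIndex]
            }
        }
    }
    return result
}
```

The reordered buffer could then be handed to whichever array library is in use. Alternatively, if the file can be re-saved on the Python side, converting it with `numpy.ascontiguousarray` before saving avoids the problem altogether.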
// created by Customers.toDataFrameSchema()
// The same can be done for the other tables
@DataSchema
data class CustomersDf(
Maybe CustomersDFSchema?
That would be a bit verbose, wouldn't it? ;P "Simply cast it to the CustomersDfSchema DataSchema and you're good to go"
Sometimes I want to name a schema interface/data class "...Schema" too. Let's discuss it and agree on some conventions 😄!
Noo please don't. If you have a DataFrame<Person> it's clear you use Person as the schema for the dataframe, right? Adding Schema everywhere would just make it more verbose IMO. Remember that each data schema is already annotated with @DataSchema, plus the compiler plugin makes them all extend DataRowSchema as well. So you would get a "schema"-overload:
@DataSchema
data class MySchema(val a: Int): DataRowSchema
val df: DataFrame<MySchema>
Not to forget auto-generated data schemas based on JDBC/OpenAPI etc., which use their own names. Would those break the convention?
...ported-data-sources/src/main/kotlin/org/jetbrains/kotlinx/dataframe/examples/exposed/main.kt
 * @see toDataFrameSchemaWithNameNormalizer
 */
@Suppress("UNCHECKED_CAST")
fun Table.toDataFrameSchema(columnNameToAccessor: MutableMap<String, String> = mutableMapOf()): DataFrameSchema {
Very good proposal for a KDF-Exposed integration module.
Left a note about possible renaming.
Fixes #1215
Adds examples/guides in IDEA projects for using DataFrame +:
Updates README and documentation: